Add coordinator side hash joins to analytics engine #21480
mch2 wants to merge 1 commit into opensearch-project:main
❌ Gradle check result for ad66195: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #21480 +/- ##
============================================
+ Coverage 73.44% 73.50% +0.05%
- Complexity 74426 74483 +57
============================================
Files 5970 5970
Lines 338267 338267
Branches 48753 48753
============================================
+ Hits 248451 248636 +185
+ Misses 70042 69826 -216
- Partials 19774 19805 +31
❌ Gradle check result for 1ad9dc0: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
❌ Gradle check result for b88d249: null. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for b88d249: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Description
Summary
Brings PPL join coverage (inner, left outer, cross, lookup, lookup+rename) online through the analytics-engine route, plus fixes uncovered during the merge from main.
- JoinCommandIT: 6 ITs covering inner, left outer, cross, lookup, lookup+rename (all green). appendcol is skipped with @AwaitsFix because it depends on the window-function capability track (separate PR).
- OpenSearchJoinRule.matches relaxed from !leftKeys.isEmpty() to info.isEqui().
- CAST project capability: added Scalar(CAST, ...) to DataFusionAnalyticsBackendPlugin.projectCapabilities(). LEFT OUTER JOIN injects CAST for nullable coercion; without this the project rule rejected the expression.
- DataFusionFragmentConvertor.rewire(Plan, Rel) previously reused innerRoot.getNames() for the wrapper's output names. When the wrapper's output schema differs from the inner plan's (e.g. Aggregate over Join shrinks 4 columns → 1), this emitted a mismatched Plan.Root.names list and DataFusion rejected it with "Names list must match exactly to nested schema, but found N uses for M names". Fixed by taking the wrapper's own names.
- RowProducingSink.close() no longer drops buffered batches: close() used to release and clear the buffer. ShardFragmentStageExecution.onShardTerminated invokes close() before the SUCCEEDED transition, and the PlanWalker's completion listener reads results during that transition, so it saw 0 rows. The terminal root sink is now a no-op on close; the downstream consumer (DefaultPlanExecutor#batchesToRows) already owns per-batch cleanup. The error path keeps leak safety via the new releaseUnread().
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
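Reviewer note on the rewire(Plan, Rel) names fix from the Summary: the mismatch is easiest to see in a toy model. This is only a sketch under stated assumptions. RelSketch, RewireNamesSketch, and the column names are hypothetical stand-ins, not the real Substrait or DataFusion types, and the size comparison merely approximates the check that produced the "Names list must match exactly" error.

```java
import java.util.List;

/** Hypothetical plan node: operator name, output column names, optional input. */
final class RelSketch {
    final String op;
    final List<String> outputNames;
    final RelSketch input; // null for a leaf

    RelSketch(String op, List<String> outputNames, RelSketch input) {
        this.op = op;
        this.outputNames = outputNames;
        this.input = input;
    }
}

final class RewireNamesSketch {
    /** Old behaviour as described in the Summary: reuse the inner plan's names. */
    static List<String> rootNamesFromInner(RelSketch wrapper) {
        return wrapper.input.outputNames;
    }

    /** Fixed behaviour as described in the Summary: use the wrapper's own names. */
    static List<String> rootNamesFromWrapper(RelSketch wrapper) {
        return wrapper.outputNames;
    }

    /** Assumed stand-in for the consumer-side check: one name per output column. */
    static boolean namesMatchSchema(List<String> rootNames, RelSketch root) {
        return rootNames.size() == root.outputNames.size();
    }

    public static void main(String[] args) {
        // Aggregate over Join: 4 columns shrink to 1 (column names are invented).
        RelSketch join = new RelSketch("Join", List.of("l_id", "l_name", "r_id", "r_city"), null);
        RelSketch agg = new RelSketch("Aggregate", List.of("cnt"), join);

        System.out.println(namesMatchSchema(rootNamesFromInner(agg), agg));   // false
        System.out.println(namesMatchSchema(rootNamesFromWrapper(agg), agg)); // true
    }
}
```

When the wrapper changes the output width, only names derived from the wrapper itself can line up with the root schema, which is consistent with the fix described above.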
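Reviewer note on the RowProducingSink.close() change from the Summary: the ownership hand-off can be sketched as below. This is an illustration under assumptions, not the PR's implementation. BatchSketch, TerminalSinkSketch, feed(), and drain() are hypothetical names; only close() and releaseUnread() echo the PR text, and their bodies here are guesses at the described behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical batch with an explicit release, standing in for an off-heap record batch. */
final class BatchSketch {
    final int rows;
    boolean released = false;

    BatchSketch(int rows) { this.rows = rows; }

    void release() { released = true; }
}

/** Hypothetical terminal sink; illustrative only. */
final class TerminalSinkSketch implements AutoCloseable {
    private final List<BatchSketch> buffer = new ArrayList<>();

    void feed(BatchSketch batch) { buffer.add(batch); }

    /** No-op: buffered batches must survive until the completion listener reads them. */
    @Override
    public void close() { }

    /** Success path: hand the batches, and the duty to release them, to the consumer. */
    List<BatchSketch> drain() {
        List<BatchSketch> out = new ArrayList<>(buffer);
        buffer.clear();
        return out;
    }

    /** Error path: release whatever was never handed to a consumer. */
    void releaseUnread() {
        for (BatchSketch b : buffer) {
            b.release();
        }
        buffer.clear();
    }
}

final class SinkCloseSketch {
    public static void main(String[] args) {
        TerminalSinkSketch sink = new TerminalSinkSketch();
        sink.feed(new BatchSketch(3));
        sink.feed(new BatchSketch(2));
        sink.close(); // shard terminates before the results are read

        int rows = 0;
        for (BatchSketch b : sink.drain()) {
            rows += b.rows;
            b.release(); // the consumer owns per-batch cleanup
        }
        System.out.println(rows); // 5
    }
}
```

The design point is that close() and cleanup are decoupled: the success path transfers ownership to the reader, while the error path still has a way to avoid leaks.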